Bagging and the Random Subspace Method for Redundant Feature Spaces

نویسندگان

Marina Skurichina

Robert P. W. Duin

چکیده

The performance of a single weak classifier can be improved by using combining techniques such as bagging, boosting and the random subspace method. When applying them to linear discriminant analysis, it appears that they are useful in different situations. Their performance is strongly affected by the choice of the base classifier and the training sample size. As well, their usefulness depends on the data distribution. In this paper, on the example of the pseudo Fisher linear classifier, we study the effect of the redundancy in the data feature set on the performance of the random subspace method and bagging.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Two credit scoring models based on dual strategy ensemble trees

Decision tree (DT) is one of the most popular classification algorithms in data mining and machine learning. However, the performance of DT based credit scoring model is often relatively poorer than other techniques. This is mainly due to two reasons: DT is easily affected by (1) the noise data and (2) the redundant attributes of data under the circumstance of credit scoring. In this study, we ...

متن کامل

Ensemble classification based on generalized additive models

Generalized additive models (GAMs) are a generalization of generalized linear models (GLMs) and constitute a powerful technique which has successfully proven its ability to capture nonlinear relationships between explanatory variables and a response variable in many domains. In this paper, GAMs are proposed as base classifiers for ensemble learning. Three alternative ensemble strategies for bin...

متن کامل

Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets

We present attribute bagging (AB), a technique for improving the accuracy and stability of classi#er ensembles induced using random subsets of features. AB is a wrapper method that can be used with any learning algorithm. It establishes an appropriate attribute subset size and then randomly selects subsets of features, creating projections of the training set on which the ensemble classi#ers ar...

متن کامل

Weighted random subspace method for high dimensional data classification.

High dimensional data, especially those emerging from genomics and proteomics studies, pose significant challenges to traditional classification algorithms because the performance of these algorithms may substantially deteriorate due to high dimensionality and existence of many noisy features in these data. To address these problems, pre-classification feature selection and aggregating algorith...

متن کامل

A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)

Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2001

Bagging and the Random Subspace Method for Redundant Feature Spaces

نویسندگان

چکیده

منابع مشابه

Two credit scoring models based on dual strategy ensemble trees

Ensemble classification based on generalized additive models

Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets

Weighted random subspace method for high dimensional data classification.

A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)

عنوان ژورنال:

اشتراک گذاری